regular image
Strike a Balance in Continual Panoptic Segmentation
Chen, Jinpeng, Cong, Runmin, Luo, Yuxuan, Ip, Horace Ho Shing, Kwong, Sam
This study explores the emerging area of continual panoptic segmentation, highlighting three key balances. First, we introduce past-class backtrace distillation to balance the stability of existing knowledge with the adaptability to new information. This technique retraces the features associated with past classes based on the final label assignment results, performing knowledge distillation targeting these specific features from the previous model while allowing other features to flexibly adapt to new information. Additionally, we introduce a class-proportional memory strategy, which aligns the class distribution in the replay sample set with that of the historical training data. This strategy maintains a balanced class representation during replay, enhancing the utility of the limited-capacity replay sample set in recalling prior classes. Moreover, recognizing that replay samples are annotated only for the classes of their original step, we devise balanced anti-misguidance losses, which combat the impact of incomplete annotations without incurring classification bias. Building upon these innovations, we present a new method named Balanced Continual Panoptic Segmentation (BalConpas). Our evaluation on the challenging ADE20K dataset demonstrates its superior performance compared to existing state-of-the-art methods.
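The class-proportional memory idea can be illustrated with a small allocation routine. This is a minimal sketch, not the BalConpas implementation: the abstract does not say how samples are selected, so the function name `proportional_quota`, the largest-remainder rounding, and the toy class counts are all assumptions made for illustration. It also treats each replay slot as belonging to one class, whereas a real panoptic image usually contains several classes.

```python
from collections import Counter


def proportional_quota(class_counts, capacity):
    """Split `capacity` replay slots across classes in proportion to their counts.

    Hypothetical helper: uses largest-remainder rounding so the quotas
    always sum exactly to `capacity`.
    """
    total = sum(class_counts.values())
    if total == 0 or capacity <= 0:
        return {c: 0 for c in class_counts}
    # Ideal (fractional) share of the memory for each class.
    exact = {c: capacity * n / total for c, n in class_counts.items()}
    # Start from the floor of each share.
    quota = {c: int(v) for c, v in exact.items()}
    leftover = capacity - sum(quota.values())
    # Hand the remaining slots to the classes with the largest fractional part;
    # ties are broken by class name so the result is deterministic.
    order = sorted(class_counts, key=lambda c: (-(exact[c] - quota[c]), str(c)))
    for c in order[:leftover]:
        quota[c] += 1
    return quota


# Toy historical class distribution (made-up numbers).
history = Counter({"wall": 600, "sky": 300, "lamp": 100})
quota = proportional_quota(history, capacity=10)
print(quota)
```

With these toy counts the replay set mirrors the 6:3:1 ratio of the historical data, which is the property the abstract describes.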
SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for Multi-label Image Classification
Wang, Lei, Zhan, Yibing, Ma, Leilei, Tao, Dapeng, Ding, Liang, Gong, Chen
Recently, Mix-style data augmentation methods (e.g., Mixup and CutMix) have shown promising performance in various visual tasks. However, these methods are primarily designed for single-label images and ignore the considerable discrepancies between single- and multi-label images, i.e., a multi-label image involves multiple co-occurring categories and varying object scales. On the other hand, previous multi-label image classification (MLIC) methods tend to design elaborate models, which brings expensive computation. In this paper, we introduce a simple but effective augmentation strategy for multi-label image classification, namely SpliceMix. The "splice" in our method is two-fold: 1) Each mixed image is a splice of several downsampled images in the form of a grid, where the semantics of the images taking part in the mix are blended without object deficiencies, alleviating co-occurrence bias; 2) We splice mixed images and the original mini-batch to form a new SpliceMixed mini-batch, which allows an image at different scales to contribute to training together. Furthermore, this splice in our SpliceMixed mini-batch enables interactions between mixed images and original regular images. We also offer a simple and non-parametric extension based on consistency learning (SpliceMix-CL) to show the flexible extensibility of SpliceMix. Extensive experiments on various tasks demonstrate that using only SpliceMix with a baseline model (e.g., ResNet) achieves better performance than state-of-the-art methods. Moreover, the generalizability of SpliceMix is further validated by the improvements in current MLIC methods when combined with SpliceMix. The code is available at https://github.com/zuiran/SpliceMix.
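The two-fold splice can be sketched in a few lines. This is an illustrative sketch only, not the code from the linked repository: the helper names `downsample` and `splice_grid`, the stride-based downsampling, the fixed 2x2 grid, and the label rule (element-wise maximum, i.e. the union of the multi-hot labels) are assumptions chosen to keep the example small.

```python
def downsample(img, factor):
    """Keep every `factor`-th pixel along both axes (a crude stand-in for resizing)."""
    return [row[::factor] for row in img[::factor]]


def splice_grid(images, labels, rows=2, cols=2):
    """Splice rows*cols equally sized images into one image of the original size.

    Hypothetical helper. `images` are 2-D lists of pixel values and `labels`
    are multi-hot lists of equal length.
    """
    assert len(images) == len(labels) == rows * cols
    assert rows == cols, "this sketch uses one downsampling factor for both axes"
    small = [downsample(img, rows) for img in images]
    mixed = []
    for r in range(rows):
        tiles = small[r * cols:(r + 1) * cols]
        # Concatenate the tiles of this grid row side by side, scanline by scanline.
        for y in range(len(tiles[0])):
            mixed.append([px for tile in tiles for px in tile[y]])
    # Assumed label rule: the mixed image carries every class present in any tile.
    mixed_label = [max(vals) for vals in zip(*labels)]
    return mixed, mixed_label


# Four constant 4x4 "images" with pixel values 0..3 and three-class multi-hot labels.
images = [[[k] * 4 for _ in range(4)] for k in range(4)]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]

mixed, mixed_label = splice_grid(images, labels)

# Second splice: append the mixed image to the original mini-batch.
batch_images = images + [mixed]
batch_labels = labels + [mixed_label]
print(mixed, mixed_label, len(batch_images))
```

Each source image appears both at full scale and, downsampled, inside the grid, which is how the same image contributes to training at two scales within one mini-batch.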
Introduction to Adversarial Machine Learning
Here we are in 2019, where State-Of-The-Art (from now on SOTA) classifiers are published every day; some propose entirely new architectures, others propose tweaks needed to train a classifier more accurately. To keep things simple, let's talk about simple image classifiers, which have come a long way from GoogLeNet to AmoebaNet-A, which reaches 83% (top-1) accuracy on ImageNet. If we take an image and change a few pixels on it (not randomly), the result looks the same to the human eye yet can cause SOTA classifiers to fail miserably! I have a few benchmarks here. You can see how miserably these classifiers fail even with the simplest perturbations. This is an alarming situation for the Machine Learning community, especially as we move closer and closer to adopting these SOTA models in real-world applications. Let's discuss a few real-life examples to help understand the seriousness of the situation. Tesla has come a long way, and many self-driving car companies are trying to keep pace with them. Recently, however, it was shown that SOTA models used by Tesla can be fooled by simple stickers (adversarial patches) placed on the road, which the car interprets as the lane diverging, causing it to drive into oncoming traffic. The severity of this situation is very much underestimated even by Elon (CEO of Tesla) himself, while I believe Andrej Karpathy (Head of AI, Tesla) is quite aware of how dangerous the situation is. This thread from Jeremy (Co-Founder of Fast.ai) says it all: "In this clip, @elonmusk tells @lexfridman that adversarial examples are trivially easily fixed. @karpathy is that your experience at @tesla? @catherineols is that what the neurips adversarial challenge found?" A recently released paper showed that a stop sign manipulated with adversarial patches caused the SOTA model to begin "thinking" that it was a speed limit sign. This sounds scary, doesn't it?
Not to mention that these attacks can be used to make the networks predict whatever the attackers want! Imagine an attacker who manipulates road signs in a way such that self-driving cars will break traffic rules.
Mind-reading AI isn't sci-fi anymore... and it's just getting started
Despite its overwhelming success, the human brain peaked about two million years ago. Lucky for us, computers are helping us understand our brains better, but there may be some consequences to giving AI a skeleton key to our mind. A team of Japanese researchers recently conducted a series of experiments in creating an end-to-end solution for training a neural network to interpret fMRI scans. Previous work achieved similar results; the new method differs in how the AI is trained. An fMRI is a non-invasive and safe brain scan similar to a normal MRI.
On the Limitation of Convolutional Neural Networks in Recognizing Negative Images
Hosseini, Hossein, Xiao, Baicen, Jaiswal, Mayoore, Poovendran, Radha
Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance on a variety of computer vision tasks, particularly visual classification problems, where new algorithms are reported to achieve or even surpass human performance. In this paper, we examine whether CNNs are capable of learning the semantics of training data. To this end, we evaluate CNNs on negative images, since they share the same structure and semantics as regular images and humans can classify them correctly. Our experimental results indicate that when a model is trained on regular images and tested on negative images, its accuracy is significantly lower than when it is tested on regular images. This leads us to the conjecture that current training methods do not effectively train models to generalize the concepts. We then introduce the notion of semantic adversarial examples - transformed inputs that semantically represent the same objects, but that the model does not classify correctly - and present negative images as one class of such inputs.
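For readers unfamiliar with the transformation, a negative image is conventionally the pixel-wise complement of the original. The sketch below assumes 8-bit grayscale pixels stored as nested lists; the function name `negative` and the sample values are illustrative and are not taken from the paper.

```python
def negative(img, max_value=255):
    """Return the complement of `img`: every pixel p becomes max_value - p."""
    return [[max_value - px for px in row] for row in img]


# A tiny 2x2 grayscale "regular" image.
regular = [[0, 64], [128, 255]]
inverted = negative(regular)
print(inverted)
```

Edges and shapes are unchanged by this mapping and only the brightness values are flipped, which is why the abstract says negative images share the structure and semantics of regular ones. Applying `negative` twice returns the original image.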